Recently, graph neural networks (GNNs) have been gaining attention for simulating dynamical systems thanks to their inductive nature, which enables zero-shot generalizability. Similarly, physics-informed inductive biases in deep-learning frameworks have been shown to give superior performance in learning the dynamics of physical systems. There is a growing volume of literature that attempts to combine these two approaches. Here, we evaluate the performance of thirteen different graph neural networks, namely, Hamiltonian and Lagrangian graph neural networks, graph neural ODEs, and their variants with explicit constraints and different architectures. We briefly explain the theoretical formulations, highlighting the similarities and differences in the inductive biases and graph architectures of these systems. We evaluate these models on spring, pendulum, gravitational, and 3D deformable solid systems and compare their performance in terms of rollout error, conserved quantities such as energy and momentum, and generalizability to unseen system sizes. Our study demonstrates that GNNs with additional inductive biases, such as explicit constraints and the decoupling of kinetic and potential energies, exhibit significantly enhanced performance. Further, all the physics-informed GNNs exhibit zero-shot generalizability to system sizes an order of magnitude larger than the training systems, thus providing a promising route toward simulating large-scale realistic systems.
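The rollout and energy metrics used in this benchmark can be illustrated on the simplest of the evaluated systems, a 1D spring (harmonic oscillator). The sketch below is purely illustrative and not any of the thirteen models: it integrates the known Hamiltonian H(q, p) = p²/2 + q²/2 with a symplectic Euler scheme, the kind of structure-preserving update that keeps the conserved energy bounded over long rollouts.

```python
import numpy as np

# Symplectic Euler rollout of a unit-mass, unit-stiffness spring.
# Energy H = p^2/2 + q^2/2 should stay close to its initial value.
def symplectic_euler(q, p, dt, steps):
    traj = [(q, p)]
    for _ in range(steps):
        p = p - dt * q      # dp/dt = -dH/dq = -q
        q = q + dt * p      # dq/dt =  dH/dp =  p
        traj.append((q, p))
    return np.array(traj)

traj = symplectic_euler(q=1.0, p=0.0, dt=0.01, steps=1000)
energy = 0.5 * traj[:, 1] ** 2 + 0.5 * traj[:, 0] ** 2
drift = np.abs(energy - energy[0]).max()  # energy-violation metric
```

A learned model replaces the hand-coded derivatives of H; the evaluation then tracks exactly these two quantities, trajectory (rollout) error and energy drift.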
Neural networks with physics-based inductive biases, such as Lagrangian neural networks (LNNs) and Hamiltonian neural networks (HNNs), learn the dynamics of physical systems by encoding strong inductive biases. Alternatively, neural ODEs with appropriate inductive biases have been shown to exhibit similar performance. However, when applied to particle-based systems, these models are transductive in nature and hence do not generalize to large system sizes. In this paper, we present a graph-based neural ODE, GNODE, to learn the time evolution of dynamical systems. Further, we carefully analyze the role of different inductive biases on the performance of GNODE. We show that, similar to LNNs and HNNs, encoding constraints explicitly can significantly improve the training efficiency and performance of GNODE. Our experiments also assess the value of additional inductive biases, such as Newton's third law, on the final performance of the model. We demonstrate that inducing these biases can enhance the performance of the model by orders of magnitude in terms of both energy violation and rollout error. Interestingly, we observe that the GNODE trained with the most effective inductive biases, namely MCGNODE, outperforms the graph versions of LNN and HNN, namely Lagrangian graph networks (LGN) and Hamiltonian graph networks (HGN), in terms of energy violation error by approximately four orders of magnitude for the pendulum system and approximately two orders of magnitude for the spring system. These results suggest that NODE-based systems can achieve performance competitive with energy-conserving neural networks by inducing the appropriate inductive biases.
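The explicit-constraint bias credited above with order-of-magnitude gains can be sketched with a toy pendulum: after each unconstrained update, the state is projected back onto the constraint manifold. This is a hedged illustration, not the paper's GNODE; a learned model would supply the acceleration, for which plain gravity is substituted here.

```python
import numpy as np

L_ROD, G, DT = 1.0, 9.81, 0.001  # rod length, gravity, time step

def step_with_projection(pos, vel):
    # Unconstrained (Euler) update under gravity.
    vel = vel + DT * np.array([0.0, -G])
    pos = pos + DT * vel
    # Project position back onto the constraint |pos| = L_ROD
    # and remove the velocity component that violates it.
    pos = L_ROD * pos / np.linalg.norm(pos)
    vel = vel - pos * (vel @ pos) / (pos @ pos)
    return pos, vel

pos, vel = np.array([L_ROD, 0.0]), np.array([0.0, 0.0])
for _ in range(2000):
    pos, vel = step_with_projection(pos, vel)
violation = abs(np.linalg.norm(pos) - L_ROD)  # constraint violation
```

Without the two projection lines, the bob slowly drifts off the circle; with them, the constraint is satisfied to machine precision at every step, which is the intuition behind encoding constraints explicitly rather than hoping the network learns them.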
Fluid-flow devices with low dissipation but high contact area are important in many applications. A well-known strategy for designing such devices is multi-scale topology optimization (MTO), where optimal microstructures are designed in each cell of the discretized domain. Unfortunately, MTO is computationally very expensive, since the evolving microstructures must be homogenized at every step of the optimization process. As an alternative, we propose here graded multi-scale topology optimization (GMTO) for designing fluid-flow devices. In the proposed method, several pre-selected but size-parameterized and orientable microstructures are used to fill the domain optimally. GMTO significantly reduces the computation while retaining many of the benefits of MTO. In particular, GMTO is implemented here using a neural network (NN) since: (1) the homogenization can be performed off-line and used by the NN during optimization, (2) it enables continuous switching between microstructures during optimization, (3) the number of design variables and the computational effort are independent of the number of microstructures used, and (4) it supports automatic differentiation, thereby eliminating manual sensitivity analysis. Several numerical results are presented to illustrate the proposed framework.
Microstructures, i.e., architected materials, are typically designed by maximizing an objective, such as the bulk modulus, subject to a volume constraint. However, in many applications, it is often more appropriate to impose constraints on other physical quantities of interest. In this paper, we consider such generalized microstructure optimization problems, where any of the microstructural quantities, namely bulk modulus, shear modulus, Poisson's ratio, or volume, can serve as the objective, while the remaining can serve as constraints. In particular, we present here a neural network (NN) framework to solve such problems. The framework relies on the classic density formulation of microstructure optimization, but the density field is represented through the NN's weights and biases. The main characteristics of the proposed NN framework are: (1) it supports automatic differentiation, eliminating the need for manual sensitivity derivations, (2) smoothing filters are not required due to implicit filtering, (3) the framework can be easily extended to multiple materials, and (4) a high-resolution microstructural topology can be recovered through a simple post-processing step. The framework is illustrated through a variety of microstructure optimization problems.
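The key representational idea above, a density field carried by a network's weights and biases rather than by one variable per element, can be sketched in a few lines. The weights below are random placeholders (a real pipeline would train them against the homogenized objective with automatic differentiation), but the sketch shows why the representation is implicitly filtered, bounded in (0, 1), and resolution-independent.

```python
import numpy as np

rng = np.random.default_rng(0)
# Tiny 2-16-1 network: its weights/biases ARE the design variables.
W1, b1 = rng.normal(size=(2, 16)), np.zeros(16)
W2, b2 = rng.normal(size=(16, 1)), np.zeros(1)

def density(x):
    """Continuous density field rho(x) in (0, 1) over 2D coordinates."""
    h = np.tanh(x @ W1 + b1)
    return 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))  # sigmoid bound

# Because rho is a smooth function of coordinates, it can be sampled on a
# grid of any resolution -- this is how a high-resolution topology is
# recovered by simple post-processing after optimization.
grid = np.stack(np.meshgrid(np.linspace(0, 1, 64),
                            np.linspace(0, 1, 64)), axis=-1).reshape(-1, 2)
rho = density(grid)
volume_fraction = rho.mean()  # the quantity a volume constraint acts on
```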
Engineering design problems often require optimizing the underlying geometry while simultaneously selecting a suitable material. For a certain class of simple problems, the two are separable: for example, one can first select the optimal material and then optimize the geometry. In general, however, the two are not separable. Furthermore, the discrete nature of material selection is incompatible with gradient-based geometry optimization, making simultaneous optimization challenging. In this paper, we propose the use of variational autoencoders (VAEs) for simultaneous optimization. First, a data-driven VAE is used to project the discrete material database onto a continuous and differentiable latent space. This is then coupled with a fully connected neural network, embedded with a finite-element solver, to simultaneously optimize the material and the geometry. The neural network's built-in gradient optimizer and backpropagation are exploited during optimization. The proposed framework is demonstrated on trusses, where an optimal material must be chosen from a database while simultaneously optimizing the cross-sectional areas of the truss members. Several numerical examples illustrate the efficacy of the proposed framework. The Python code used in these experiments is available at github.com/uw-ersl/matruss
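The core obstacle named above, that a discrete material table admits no gradients, and the proposed remedy, decoding from a continuous latent variable, can be shown with a deliberately tiny stand-in. The "decoder" below is a linear interpolation over a toy three-material table (illustrative values, not the paper's database or VAE); a trained VAE decoder plays this role in the actual framework.

```python
import numpy as np

# Toy material database: (Young's modulus E [Pa], density rho [kg/m^3]).
materials = np.array([[70e9, 2700.0],    # aluminum-like
                      [200e9, 7850.0],   # steel-like
                      [110e9, 4500.0]])  # titanium-like

def decode(z):
    """Map a continuous latent z in [0, n-1] to material properties.

    Unlike an integer table lookup, this map is continuous in z, so a
    gradient-based optimizer can move smoothly between materials.
    """
    z = np.clip(z, 0.0, len(materials) - 1.0)
    i = int(np.floor(min(z, len(materials) - 1.001)))
    t = z - i
    return (1 - t) * materials[i] + t * materials[i + 1]

props = decode(0.5)  # a blend halfway between the first two entries
```

At the latent values 0, 1, 2 the decoder reproduces the discrete entries exactly, so a converged optimum can be snapped back to a real material from the database.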
We present Muse, a text-to-image Transformer model that achieves state-of-the-art image generation performance while being significantly more efficient than diffusion or autoregressive models. Muse is trained on a masked modeling task in discrete token space: given the text embedding extracted from a pre-trained large language model (LLM), Muse is trained to predict randomly masked image tokens. Compared to pixel-space diffusion models, such as Imagen and DALL-E 2, Muse is significantly more efficient due to the use of discrete tokens and requiring fewer sampling iterations; compared to autoregressive models, such as Parti, Muse is more efficient due to the use of parallel decoding. The use of a pre-trained LLM enables fine-grained language understanding, translating to high-fidelity image generation and the understanding of visual concepts such as objects, their spatial relationships, pose, cardinality etc. Our 900M parameter model achieves a new SOTA on CC3M, with an FID score of 6.06. The Muse 3B parameter model achieves an FID of 7.88 on zero-shot COCO evaluation, along with a CLIP score of 0.32. Muse also directly enables a number of image editing applications without the need to fine-tune or invert the model: inpainting, outpainting, and mask-free editing. More results are available at https://muse-model.github.io
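The two mechanisms the abstract credits for Muse's efficiency, masked modeling over discrete tokens and parallel decoding over several refinement steps, can be sketched schematically. This is a hedged illustration, not Muse's code: the constants, the 60% mask ratio, and the cosine unmasking schedule are assumptions chosen for the sketch.

```python
import numpy as np

MASK, VOCAB, N = -1, 1024, 256  # mask id, codebook size, tokens per image
rng = np.random.default_rng(0)

def mask_tokens(tokens, ratio):
    """Training view: hide a random subset; the model predicts them."""
    masked = tokens.copy()
    idx = rng.choice(len(tokens), size=int(ratio * len(tokens)),
                     replace=False)
    masked[idx] = MASK
    return masked, idx

def decode_steps(n_tokens, steps):
    """Inference schedule: tokens still masked after each parallel step."""
    return [int(n_tokens * np.cos(0.5 * np.pi * (t + 1) / steps))
            for t in range(steps)]

tokens = rng.integers(0, VOCAB, size=N)   # stand-in for VQ image tokens
masked, idx = mask_tokens(tokens, ratio=0.6)
schedule = decode_steps(N, steps=8)       # e.g. 8 steps instead of N
```

The point of the schedule is the contrast with autoregressive decoding: all currently masked positions are predicted in parallel at each of a handful of steps, rather than one token per step for all N tokens.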
Applying machine learning to domains like the Earth sciences is impeded by the lack of labeled data, despite the large corpus of raw data available in such domains. For instance, training a wildfire classifier on satellite imagery requires curating a massive and diverse dataset, an expensive and time-consuming process that can span weeks to months. Searching for relevant examples in over 40 petabytes of unlabeled data requires researchers to manually hunt for such images, much like finding a needle in a haystack. We present a no-code end-to-end pipeline, Curator, which dramatically reduces the time taken to curate an exhaustive labeled dataset. Curator is able to search massive amounts of unlabeled data by combining self-supervision, scalable nearest-neighbor search, and active learning to learn and differentiate image representations. The pipeline can also be readily applied to problems across different domains. Overall, the pipeline makes it practical for researchers to go from a single reference image to a comprehensive dataset in a very short span of time.
Object instance segmentation is a key challenge for indoor robots navigating cluttered environments with many small objects. Limitations in 3D sensing capabilities often make it difficult to detect every possible object. While deep learning approaches may be effective for this problem, manually annotating 3D data for supervised learning is time-consuming. In this work, we explore zero-shot instance segmentation (ZSIS) from RGB-D data to identify unseen objects in a semantic category-agnostic manner. We introduce a zero-shot split for Tabletop Objects Dataset (TOD-Z) to enable this study and present a method that uses annotated objects to learn the ``objectness'' of pixels and generalize to unseen object categories in cluttered indoor environments. Our method, SupeRGB-D, groups pixels into small patches based on geometric cues and learns to merge the patches in a deep agglomerative clustering fashion. SupeRGB-D outperforms existing baselines on unseen objects while achieving similar performance on seen objects. Additionally, it is extremely lightweight (0.4 MB memory requirement) and suitable for mobile and robotic applications. The dataset split and code will be made publicly available upon acceptance.
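The patch-merging idea above can be illustrated with a toy greedy agglomeration: oversegment into small patches, then repeatedly merge the closest pair of clusters until a distance threshold is reached. In SupeRGB-D the merge decision is learned; here a fixed centroid-distance cue stands in for the learned score, and the points and threshold are invented for the sketch.

```python
import numpy as np

def agglomerate(centroids, threshold):
    """Greedy agglomerative clustering of patch centroids."""
    clusters = [[i] for i in range(len(centroids))]
    cents = [c for c in centroids]
    while len(clusters) > 1:
        best, bi, bj = np.inf, -1, -1
        for i in range(len(cents)):          # find the closest pair
            for j in range(i + 1, len(cents)):
                d = np.linalg.norm(cents[i] - cents[j])
                if d < best:
                    best, bi, bj = d, i, j
        if best > threshold:                  # no pair close enough: stop
            break
        clusters[bi] += clusters.pop(bj)      # merge the pair
        merged = cents.pop(bj)
        cents[bi] = (cents[bi] + merged) / 2.0
    return clusters

# Two well-separated "objects", each oversegmented into two nearby patches.
pts = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
groups = agglomerate(pts, threshold=1.0)  # recovers the two objects
```

Because the merge rule looks only at geometric relations between patches and never at a category label, the same procedure applies unchanged to object categories never seen in training, which is the category-agnostic property the method relies on.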
The COVID-19 pandemic created a deluge of questionable and contradictory scientific claims about drug efficacy -- an "infodemic" with lasting consequences for science and society. In this work, we argue that NLP models can help domain experts distill and understand the literature in this complex, high-stakes area. Our task is to automatically identify contradictory claims about COVID-19 drug efficacy. We frame this as a natural language inference problem and offer a new NLI dataset created by domain experts. The NLI framing allows us to create curricula combining existing datasets and our own. The resulting models are useful investigative tools. We provide a case study of how these models help a domain expert summarize and assess evidence concerning remdesivir and hydroxychloroquine.
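The NLI framing means each instance is a pair of claims plus one of three labels, with "contradiction" flagging the conflicting-evidence cases the experts care about. The example below is hypothetical, written for illustration; it is not drawn from the authors' dataset.

```python
# Standard three-way NLI label set, applied to pairs of efficacy claims.
LABELS = ("entailment", "neutral", "contradiction")

example = {
    "premise": "The drug shortened time to recovery in hospitalized "
               "patients in the trial.",
    "hypothesis": "The drug had no effect on time to recovery.",
    "label": "contradiction",  # the pair the task aims to surface
}

def is_valid(ex):
    """Check that an instance is well-formed for NLI training."""
    return (ex["label"] in LABELS
            and bool(ex["premise"]) and bool(ex["hypothesis"]))
```

Casting contradiction detection into this standard format is what lets existing NLI datasets be mixed into training curricula alongside the new expert-built one.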
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as the bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical image analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%), and 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based; of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.